Automatic accentedness evaluation of non-native speech using phonetic and sub-phonetic posterior probabilities
Abstract
Automatic evaluation of non-native speech accentedness has potential implications not only for language learning and accent identification systems but also for speaker and speech recognition systems. From the perspective of speech production, the two primary factors influencing accentedness are phonetic structure and prosodic structure. In this paper, we propose an approach for automatic accentedness evaluation based on comparing instances of native and non-native speech at the acoustic-phonetic level. Specifically, the proposed approach measures accentedness by comparing the phone class conditional probability sequences of a native speaker's instance and a non-native speaker's instance. We evaluate the proposed approach on the EMIME bilingual and EMIME Mandarin bilingual corpora, which contain English speech from native English speakers and from non-native English speakers whose native languages are Finnish, German and Mandarin. We also investigate how the granularity of the phonetic unit representation influences the performance of the proposed accentedness measure. Our results indicate that the accentedness ratings produced by the proposed approach correlate consistently with human ratings of accentedness. In addition, our studies show that the granularity of the phonetic unit representation that yields the best correlation with the human accentedness ratings varies with the native language of the non-native speakers.
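The abstract does not say how the two posterior probability sequences are aligned or which distance is used between frames. As a rough illustration only, the sketch below assumes dynamic time warping (DTW) with a symmetric Kullback-Leibler divergence as the local cost; both choices, along with the function names `symmetric_kl` and `dtw_accentedness_score`, are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL divergence between two posterior vectors (assumed local cost)."""
    # Clip to avoid log(0), then renormalise so both vectors sum to one.
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def dtw_accentedness_score(native, non_native):
    """Length-normalised DTW cost between two posterior sequences.

    Each argument is an array of shape (frames, classes), one posterior
    distribution per frame. A larger score means the two instances differ
    more at the acoustic-phonetic level (hypothetical scoring rule).
    """
    n, m = len(native), len(non_native)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = symmetric_kl(native[i - 1], non_native[j - 1])
            # Standard DTW recursion: insertion, deletion, or match step.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Normalise by total length so long utterances are not penalised.
    return cost[n, m] / (n + m)
```

Under these assumptions, a score near zero means the non-native instance closely follows the native one, and such scores could then be compared against human accentedness ratings with a standard correlation coefficient.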
Similar resources
Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures
This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy relying on posteriors’ intrinsic sparsity structures. The posterior probabilities are estimated for phonetic and phonological classes using a deep neural network (DNN) computational framework. Exploiting the class-specific sparsity leads to a simple quan...
Automatic Detection of Mispronunciation in non-native Swedish Speech
This contribution presents part of the work initiated at CTT on the development of speech technology to assist non-native speakers in learning Swedish. This study mainly focuses on the automatic evaluation of mispronunciations at a phonetic level. We describe a new database we have collected for this work. Then we report the reliability of several phonetic scores to locate automatically segment...
ASR Systems as Models of Phonetic Category Perception in Adults
Adult speech perception is tuned to efficiently process native phonetic categories, causing difficulties with certain non-native categories. For example, Japanese has no equivalent of the distinction between American English /r/ and /l/, and native speakers of Japanese have a hard time discriminating between these two sounds. Here, we ask whether standard Automatic Speech Recognition (ASR) syste...
Reliability of non-native speech automatic segmentation for prosodic feedback
This paper investigates the reliability of phonetic boundaries obtained through automatic segmentation of non-native speech for automatic prosodic feedback for foreign language learning. Indeed, prosodic feedback requires checking the fundamental frequency and the duration of phonetic segments of the learner utterances with respect to some reference patterns. Segmentation evaluations carried ou...
Assessment of Non-native Prosody for Spanish as L2 using quantitative scores and perceptual evaluation
In this work we present SAMPLE, a new pronunciation database of Spanish as L2, and first results on the automatic assessment of non-native prosody. Listen-and-repeat and reading tasks are carried out by native and foreign speakers of Spanish. The corpus has been designed to support comparative studies and evaluation of automatic pronunciation error assessment at both the phonetic and prosodic levels. Fo...